OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A survey on Arabic character segmentation

Internal identifier: 000208 (Main/Exploration); previous: 000207; next: 000209

Authors: Yasser M. Alginahi [Saudi Arabia]

Source: International journal on document analysis and recognition (Print) [ISSN 1433-2833]; 2013

RBID : Pascal:14-0004360

French descriptors

English descriptors

Abstract

Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems in Arabic character recognition; however, incorrectly segmented characters will cause misclassifications of characters which in turn may lead to wrong results. Therefore, off-line Arabic character segmentation is a difficult research problem and little research has been achieved in this area in the past few decades. This is due to both the cursive nature of Arabic writing in both printed and handwritten forms and the scarcity of Arabic databases and dictionaries. Most of the character recognition methods used in the recognition of Arabic characters are adopted from available methods used on handwritten Latin and Chinese characters; however, other methods are developed only for Arabic character segmentation. This survey presents the description of the Arabic script characteristics with an overview on OCR systems and a comprehensive review mainly on off-line printed Arabic character segmentation techniques.


Affiliations:



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author>
<name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</s1>
<s3>SAU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Arabie saoudite</country>
<wicri:noRegion>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">14-0004360</idno>
<date when="2013">2013</date>
<idno type="stanalyst">PASCAL 14-0004360 INIST</idno>
<idno type="RBID">Pascal:14-0004360</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000031</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000733</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000051</idno>
<idno type="wicri:doubleKey">1433-2833:2013:Alginahi Y:a:survey:on</idno>
<idno type="wicri:Area/Main/Merge">000211</idno>
<idno type="wicri:Area/Main/Curation">000208</idno>
<idno type="wicri:Area/Main/Exploration">000208</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author>
<name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</s1>
<s3>SAU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Arabie saoudite</country>
<wicri:noRegion>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic</term>
<term>Character recognition</term>
<term>Data analysis</term>
<term>Dictionaries</term>
<term>Ideogram</term>
<term>Image processing</term>
<term>Image segmentation</term>
<term>Indirect method</term>
<term>Manuscript character</term>
<term>Off line</term>
<term>On line</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed form</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance forme</term>
<term>Traitement image</term>
<term>Analyse donnée</term>
<term>Dictionnaire</term>
<term>En ligne</term>
<term>Arabe</term>
<term>Hors ligne</term>
<term>Formule imprimée</term>
<term>Idéogramme</term>
<term>Méthode indirecte</term>
<term>Segmentation image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Dictionnaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems in Arabic character recognition; however, incorrectly segmented characters will cause misclassifications of characters which in turn may lead to wrong results. Therefore, off-line Arabic character segmentation is a difficult research problem and little research has been achieved in this area in the past few decades. This is due to both the cursive nature of Arabic writing in both printed and handwritten forms and the scarcity of Arabic databases and dictionaries. Most of the character recognition methods used in the recognition of Arabic characters are adopted from available methods used on handwritten Latin and Chinese characters; however, other methods are developed only for Arabic character segmentation. This survey presents the description of the Arabic script characteristics with an overview on OCR systems and a comprehensive review mainly on off-line printed Arabic character segmentation techniques.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Arabie saoudite</li>
</country>
</list>
<tree>
<country name="Arabie saoudite">
<noRegion>
<name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000208 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000208 | SxmlIndent | more
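
Outside Dilib, the record printed by the commands above can also be read with a general-purpose XML parser. The following is a minimal sketch in Python, not part of Dilib: it assumes the record text is already available as a string (a shortened copy is embedded here), and the names `RECORD`, `PLACEHOLDER_NS`, `parse_record` and `summarize` are hypothetical. Because the record uses the `wicri:` and `inist:` prefixes without declaring them, the sketch adds placeholder namespace declarations, with invented URIs, before parsing.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record shown above; only the fields read below are kept.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author>
<name sortKey="Alginahi, Yasser M" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
<affiliation wicri:level="1">
<country>Arabie saoudite</country>
</affiliation>
</author>
</titleStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Arabic</term>
<term>Image segmentation</term>
<term>Optical character recognition</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

# Assumption: these URIs are placeholders invented for this sketch. They exist
# only so that a strict parser accepts the undeclared wicri: and inist: prefixes.
PLACEHOLDER_NS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'


def parse_record(text):
    """Declare the placeholder namespaces on the root element, then parse."""
    patched = text.replace("<record>", "<record " + PLACEHOLDER_NS + ">", 1)
    return ET.fromstring(patched)


def summarize(root):
    """Collect the title, author names and English keywords of a parsed record."""
    title = root.findtext(".//titleStmt/title")
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    keywords = [term.text for term in root.findall(".//keywords[@scheme='KwdEn']/term")]
    return {"title": title, "authors": authors, "keywords": keywords}


summary = summarize(parse_record(RECORD))
print(summary["title"])
print(summary["authors"])
print(summary["keywords"])
```

To read the full record instead, the output of the `HfdSelect` pipeline could be saved to a file and its contents passed to `parse_record`; the element paths used in `summarize` follow the structure of the record shown above.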

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:14-0004360
   |texte=   A survey on Arabic character segmentation
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024